@changpeng
Contributor

Instructions like load transpose use this predicate.

@llvmbot
Member

llvmbot commented Jul 28, 2025

@llvm/pr-subscribers-backend-amdgpu

Author: Changpeng Fang (changpeng)

Changes

Instructions like load transpose use this predicate.


Full diff: https://github.com/llvm/llvm-project/pull/151002.diff

1 Files Affected:

  • (modified) llvm/lib/Target/AMDGPU/FLATInstructions.td (+1)
diff --git a/llvm/lib/Target/AMDGPU/FLATInstructions.td b/llvm/lib/Target/AMDGPU/FLATInstructions.td
index 7207c251994ad..65878b4796a6a 100644
--- a/llvm/lib/Target/AMDGPU/FLATInstructions.td
+++ b/llvm/lib/Target/AMDGPU/FLATInstructions.td
@@ -168,6 +168,7 @@ class VFLAT_Real <bits<8> op, FLAT_Pseudo ps, string opName = ps.Mnemonic> :
   let WaveSizePredicate    = ps.WaveSizePredicate;
   let AsmMatchConverter    = ps.AsmMatchConverter;
   let OtherPredicates      = ps.OtherPredicates;
+  let WaveSizePredicate    = ps.WaveSizePredicate;
   let TSFlags              = ps.TSFlags;
   let UseNamedOperandTable = ps.UseNamedOperandTable;
   let SchedRW              = ps.SchedRW;

@shiltian
Contributor

and there is no test case needed for this one?

@changpeng
Contributor Author

> and there is no test case needed for this one?

Right, there is no test needed at this moment.

For gfx1200, the opcode is supported in both wave sizes, so the assembler only reports an invalid-operand error:
global_load_tr_b128 v[1:4], v0, s[0:1] offset:-64
// W64-ERR: :[[@LINE-1]]:{{[0-9]+}}: error: operands are not valid for this GPU or mode
// W32: encoding: [0x00,0xc0,0x15,0xee,0x01,0x00,0x00,0x00,0x00,0xc0,0xff,0xff]

For gfx1250, the assembler already reports a wavesize error:
global_load_tr8_b64 v[2:3], v0, s[0:1]
// GFX1250: global_load_tr8_b64 v[2:3], v0, s[0:1] ; encoding: [0x00,0x00,0x16,0xee,0x02,0x00,0x00,0x00,0x00,0x00,0x00,0x00]
// WAVESIZE-ERR: :[[@LINE-2]]:{{[0-9]+}}: error: instruction requires wavesize=32

@shiltian
Contributor

shiltian commented Jul 28, 2025

but in that case why do we need this then?

@changpeng
Contributor Author

You are right. VFLAT_Real already copies it, so this change is not needed. My intention was to add the copy to the FLAT_Real class to stay in sync with the downstream branch. Thanks for the question. I am going to abandon this PR.
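
To illustrate why the extra line would have been a no-op, here is a minimal TableGen sketch. Every name ending in `_Sketch` or `_SKETCH` is hypothetical and only stands in for the real AMDGPU classes; the actual `FLAT_Pseudo` and `VFLAT_Real` definitions carry many more fields.

```tablegen
// Hypothetical stand-ins, not the real AMDGPU definitions.
class Pred_Sketch<string cond> {
  string Cond = cond;
}
def TruePred_Sketch : Pred_Sketch<"">;
def IsWave32_Sketch : Pred_Sketch<"wavesize == 32">;

// Common base holding the predicate field with a permissive default.
class InstBase_Sketch {
  Pred_Sketch WaveSizePredicate = TruePred_Sketch;
}

class Pseudo_Sketch : InstBase_Sketch;

// The "real" instruction copies the predicate from its pseudo.
// The second 'let' assigns the same value again, so the resulting
// record is identical with or without it.
class Real_Sketch<Pseudo_Sketch ps> : InstBase_Sketch {
  let WaveSizePredicate = ps.WaveSizePredicate;
  let WaveSizePredicate = ps.WaveSizePredicate; // redundant copy
}

// A pseudo restricted to wave32, analogous to a load-transpose instruction.
let WaveSizePredicate = IsWave32_Sketch in
def LOAD_TR_PSEUDO_SKETCH : Pseudo_Sketch;

def LOAD_TR_REAL_SKETCH : Real_Sketch<LOAD_TR_PSEUDO_SKETCH>;
```

Dumping the records with `llvm-tblgen --print-records` should show `LOAD_TR_REAL_SKETCH` picking up the wave32 predicate from its pseudo, whether or not the second `let` is present.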

@changpeng closed this Jul 28, 2025